Seizure classification with selected frequency bands and EEG montages: a Natural Language Processing approach

Individualized treatment is crucial for epileptic patients with different types of seizures. The differences among patients affect the choice of drug as well as the surgical procedure. With the advance of machine learning, automatic seizure detection can ease the time-consuming and labor-intensive manual procedure of diagnosing seizures in the clinical setting. In this paper, we present a method for selecting electroencephalography (EEG) frequency bands (sub-bands) and montages (sub-zones) for classifier training that exploits Natural Language Processing of individual patients' clinical reports. The proposed approach targets individualized treatment. We integrate the prior knowledge from patients' reports into the classifier-building process, mimicking the thinking process of experienced neurologists when diagnosing seizures using EEG. The keywords from clinical documents are mapped to the EEG data in terms of frequency bands and scalp EEG electrodes. The experimental data are from the Temple University Hospital EEG seizure corpus, and the dataset is divided into groups of patients with the same seizure type and the same recording electrode reference. The classifiers include Random Forest, Support Vector Machine, and Multi-Layer Perceptron. The classification performance indicates that competitive results can be achieved with a small portion of the EEG data. Using sub-zone selection for Generalized Seizures (GNSZ) on all three electrode references, the data are reduced by nearly 50% while the performance metrics remain at the same level as with the whole frequency range and all zones. Moreover, when selecting by sub-zones and sub-bands together for GNSZ with the Linked Ears reference, the data range is reduced to 0.3% of the whole range, and the performance deviates by less than 3% from the results obtained with the whole range of data.
The results show that the proposed approach may lead to more efficient implementations of the seizure classifier that can be executed on power-efficient devices for long-lasting, real-time seizure detection.

along the scalp (scalp EEG). Given the low-cost and non-invasive nature of scalp EEG, it is still a widely used tool for probing neural functions.
The gold standard for identifying seizures is visual recognition by a trained neurophysiologist, who inspects the EEG data for abnormal electrical morphology. This manual procedure is labor-intensive and time-consuming in the clinical setting. It is also subject to interference from external noise and artifacts, and the subjective nature of such analysis can lead to disagreement among neurophysiologists.
Automated seizure detection from EEG recordings has been investigated by researchers since the 1970s [4]. Models are built to distinguish the patterns in brain signals that are manifestations of epileptic seizures. The models are typically framed in two steps: feature engineering and classification of the ictal/inter-ictal (during seizure/in-between seizures) signals. For a real-world seizure detection problem, the machine learning classification models need to be built with cost-consciousness, avoiding intermediate feature computation steps that have a high computational cost. Frequency domain features have proved to be more computationally efficient than time domain [5] and time-frequency domain features.
Background EEG frequency band (α, β, θ, δ, γ) oscillations have been intensively studied in normal brain function. These frequency bands can be recorded during wakefulness or sleep: occipital alpha activity (α) is observed in a relaxed state with eyes closed, frontal and central beta (β) and gamma (γ) oscillations during alert and vigilant mental states, theta activity (θ) during sleep or memory tasks, and frontal delta rhythm (δ) during sleep. Nevertheless, seemingly normal EEG background bands can contain clear abnormalities, which have been shown to be a significant prognostic tool [6][7][8][9].
The selection of EEG channels/montages is widely studied [10] for faster detection and noise removal. EEG montages refer to the electrodes located on the scalp that connect the patient to the recording device. Since brain signals conduct in a non-linear and dynamic manner, the recorded electrical voltage is significantly affected by electrode location [11]. Neurologists select the channels using their prior knowledge, and for an efficient approach it is vital to select the montages that carry the most discriminative information. It has been demonstrated that selecting only a small number of montages is effective for seizure detection [12,13]. For computer scientists, the process of selecting montages requires additional steps of generating and evaluating subsets of electrodes. The channel subsets are generated from the whole set using various statistical measures [14][15][16].
In this study, we use a natural language processing approach for the efficient selection of frequency bands (sub-bands) and scalp EEG electrodes (sub-zones). By consolidating each patient's clinical report, we aim to integrate the medics' prior knowledge into the classifier-building process. In particular, we classify seizure ictal/inter-ictal phases with sub-band and sub-zone selection from six designed inputs. The three types of frequency band inputs are: the whole frequency range provided in the data corpus, the background EEG frequency bands (α, β, θ, δ, γ), and the background bands selected based on keywords extracted from patients' clinical reports by Natural Language Processing (NLP). We also introduce a reduction of scalp EEG electrodes by using the electrode keywords (pre-frontal, frontal, temporal, parietal, occipital, central) extracted from each individual's clinical reports.
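To make the keyword-to-selection idea concrete, the following minimal sketch scans a free-text report for the band and zone keywords listed above. It is only an illustration under stated assumptions: plain regular-expression matching stands in for the NLP step, and the function name `extract_keywords` and the sample sentence are hypothetical, not taken from the paper or the corpus.

```python
import re

# Keyword vocabularies taken from the lists in the text above.
BAND_KEYWORDS = ("alpha", "beta", "theta", "delta", "gamma")
# Zone keyword -> electrode letter prefix (Fp, F, T, P, O, C).
ZONE_KEYWORDS = {
    "pre-frontal": "Fp",
    "prefrontal": "Fp",
    "frontal": "F",
    "temporal": "T",
    "parietal": "P",
    "occipital": "O",
    "central": "C",
}


def extract_keywords(report):
    """Return (bands, zones) mentioned in a free-text clinical report.

    Hypothetical helper: simple keyword matching, not the authors' pipeline.
    """
    text = report.lower()
    bands = {b for b in BAND_KEYWORDS if re.search(rf"\b{b}\b", text)}
    zones = set()
    for word, zone in ZONE_KEYWORDS.items():
        # The lookbehind stops "frontal" from matching inside
        # "pre-frontal" or "prefrontal".
        if re.search(rf"(?<![\w-]){re.escape(word)}\b", text):
            zones.add(zone)
    return bands, zones


# Hypothetical report sentence, written for this example only.
bands, zones = extract_keywords(
    "Sharp waves in the left temporal region with frontal delta slowing."
)
print(sorted(bands), sorted(zones))
```

A real report would also need negation handling ("no temporal spikes"), which this sketch deliberately ignores.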
The research questions that we aim to answer using our novel approach can be summarized as follows:
• RQ1: How does the selection of frequency bands (α, β, θ, δ, γ) influence seizure classification?
• RQ2: How does the selection of EEG electrodes (Fp, F, T, P, O, C) influence seizure classification?
• RQ3: How can prior knowledge from experts be integrated to build individualized seizure classification models?
In this work, to answer the above questions, we (i) present the evidence for the role of background EEG in brain functionality during seizures, (ii) illustrate that coupling selected background frequency bands with selected EEG electrodes can lead to better seizure classification results, and finally (iii) build a resource-efficient model for individualized seizure classification. In Sect. 2, we provide the background knowledge for the paper. In Sect. 3, we discuss related work. In Sect. 4, we introduce the publicly available dataset used in this work. In Sect. 5, we introduce the design thinking process, the methods, including pre-processing using Short-Time Fourier Transform (STFT) and Natural Language Processing (NLP), and the machine learning algorithms. Sect. 6 reports the experiment results, discussed in . Finally, in Sect. 7, we draw conclusions and propose future work.

Background
In this section, we discuss the medical background of the research and the different types of seizures. Specifically, we introduce the standardized "10-20 system" for EEG electrode placement, normal and abnormal EEG, and the classification of different types of seizures.

EEG electrode reference placement
Electrodes connect the EEG machine to patients for the recording of electric inputs generated by brain activity. The standardized electrode placement is represented in Fig. 1. It follows the international "10-20 system", which was originally proposed in 1958 [17]. The name of each electrode consists of two symbols. The first symbol is an abbreviation pointing to one of the six underlying brain zones. The letters include F (frontal), Fp (pre-frontal), P (parietal), C (central sulcus), T (temporal), and O (occipital). Additional electrodes, denoted by the letter A, are placed behind the outer ear to record over the prominent bone process. The second symbol is a number (or the letter z when on the mid-line) specifying the left or right side of the brain cortex: electrodes located on the right side of the scalp are assigned even numbers, and odd numbers are used for electrodes on the left side. Smaller numbers denote positions closer to the mid-line, and larger numbers denote positions farther away. Note that electrodes P7 and P8 are placed over the posterior part of the temporal region, not the parietal region, and that electrodes F7 and F8 are close not only to the frontal cortex but also to the pole of the temporal lobe.
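The naming convention above can be sketched as a small parser. This is an illustrative sketch only: the function name `parse_electrode` is hypothetical, names are assumed to use the canonical capitalization ("Fp1", "Cz"), and the P7/P8 and F7/F8 caveats from the text are noted in a comment rather than encoded.

```python
import re

# Letter prefix -> brain zone, as listed in the text above.
ZONES = {
    "Fp": "pre-frontal",
    "F": "frontal",
    "P": "parietal",
    "C": "central",
    "T": "temporal",
    "O": "occipital",
    "A": "ear",
}


def parse_electrode(name):
    """Split a 10-20 electrode name into (zone, side).

    Side is "left" for odd numbers, "right" for even numbers and
    "midline" for the suffix "z". Caveat from the text: P7/P8 actually
    sit over the posterior temporal region; the letter alone cannot
    capture that.
    """
    match = re.fullmatch(r"(Fp|[FPCTOA])(z|\d+)", name)
    if match is None:
        raise ValueError(f"not a recognized 10-20 electrode name: {name!r}")
    prefix, suffix = match.groups()
    if suffix == "z":
        side = "midline"
    elif int(suffix) % 2 == 0:
        side = "right"
    else:
        side = "left"
    return ZONES[prefix], side


for electrode in ("Fp1", "C4", "Cz", "A2"):
    print(electrode, parse_electrode(electrode))
```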
The commonly used reference schemes of EEG electrodes are categorized into two classes, namely unipolar and non-unipolar references [11]. Unipolar references construct a neutral record and include the Average Reference (AR), the Linked Ears reference (LE), and the Reference Electrode Standardization Technique (REST).
AR assumes that neuroelectricity transmits isotropically on a perfect layered spherical head, and thus uses the average of a finite number of electrodes as a reference.
The LE reference is based on the assumption that, because the ear sites lack electrical activity, the average of the potentials recorded at the two ears is close to zero. REST is based on the fact that the same brain sources generate all EEG activities. Non-unipolar references are potential differences between electrodes and include the bipolar and the Laplacian reference. The bipolar reference shows the first derivative of potentials, which is the difference between the potentials of two nearby electrodes. The Laplacian reference shows the second derivative of potentials, which is the difference between each electrode's potential and the averaged potential of its nearest four neighbors.
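The reference schemes described above differ only in what is subtracted from each channel. The sketch below illustrates AR, LE, bipolar, and Laplacian derivations on plain Python lists; it assumes all channels were recorded against a common reference, omits REST (which needs a head model), and uses hypothetical function names and toy numbers.

```python
def average_reference(signals):
    """AR: subtract the across-channel mean at every sample."""
    names = list(signals)
    n_samples = len(signals[names[0]])
    mean = [sum(signals[c][t] for c in names) / len(names)
            for t in range(n_samples)]
    return {c: [signals[c][t] - mean[t] for t in range(n_samples)]
            for c in names}


def linked_ears_reference(signals, left="A1", right="A2"):
    """LE: subtract the average of the two ear electrodes."""
    ref = [(a + b) / 2 for a, b in zip(signals[left], signals[right])]
    return {c: [x - r for x, r in zip(values, ref)]
            for c, values in signals.items() if c not in (left, right)}


def bipolar(signals, first, second):
    """Bipolar: difference between two nearby electrodes."""
    return [x - y for x, y in zip(signals[first], signals[second])]


def laplacian(signals, center, neighbours):
    """Laplacian: electrode minus the mean of its nearest neighbours."""
    k = len(neighbours)
    return [signals[center][t] - sum(signals[nb][t] for nb in neighbours) / k
            for t in range(len(signals[center]))]


# Toy two-sample recording (made-up values, arbitrary units).
toy = {"C3": [1.0, 2.0], "C4": [3.0, 4.0], "A1": [0.0, 1.0], "A2": [2.0, 1.0]}
print(average_reference(toy)["C3"])
print(linked_ears_reference(toy)["C3"])
print(bipolar(toy, "C3", "C4"))
```

The text specifies four nearest neighbours for the Laplacian; the helper accepts any neighbour list so that edge electrodes with fewer neighbours can still be handled.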
The advantage of unipolar references is that changes can be observed directly, since the record is the potential of the electrodes themselves. The main disadvantage is that they are sensitive to common noise and artifact activity. If one electrode is contaminated, the interpretation of activity in that brain area can be difficult. Non-unipolar references are not affected by common noise because they are differences of potentials, but this differencing may attenuate the abnormalities observed in the recordings. If the derivation is zero, e.g., because cerebral activity affects the surrounding electrodes equally, the interpretation can be challenging.

Normal and abnormal EEG
Normal EEGs are measurable both qualitatively and quantitatively. Normal EEG activities appear when people are not affected by any disease. Seizure events consist of abnormal brain activities, formally known as inter-ictal epileptiform discharges (IED). The EEG of IED is characterized by unusual waveforms that deviate from the normal EEG in frequency, amplitude, morphology, localization, and reactivity. Figure 2 shows 10 s of normal and seizure EEG.
In Fig. 3, the five most common normal EEG activity frequency bands (α, β, θ, δ, γ) are represented. Each band has a different interpretation, described as follows:
(i) Alpha rhythm (α): frequency between 8 and 12 Hz. It is more prominent in the occipital regions of an adult brain and is best observed during relaxed, eyes-closed wakefulness. With eyes open and during mental alertness, alpha activities decrease in amplitude and demonstrate reactivity. Alpha variants are mixtures of the alpha rhythm with other rhythms, which have a distinct morphology but otherwise exhibit the same reactivity and localization.
(ii) Beta rhythm (β): frequency between 12 and 30 Hz. It is primarily seen in the frontal and central areas of the adult brain. It also exhibits a gradual increase
(iii) Theta rhythm (θ): frequency between 4 and 8 Hz. It is prominently seen in the central, parietal, and temporal parts of the left side scalp recording. Theta rhythm can reflect abnormal activity in adults during wakefulness and is frequently observed in adults in the sleep state.
(iv) Delta rhythm (δ): frequency between 0.5 and 4 Hz. It is most predominantly found frontally in adults and posteriorly in children. Delta waves are associated with the deepest levels of the sleep stage and have a healing effect on the body and brain.
(v) Gamma rhythm (γ): frequency between 30 and 50 Hz. It is seen in the cerebral cortex with cognitive and motor activities. Visual stimulation and meditation can increase the amplitude of gamma rhythms. It is often observed in the ictal phase of seizures and is prevalent at seizure onset. Altered gamma oscillations are regularly detected in brain disorders such as Alzheimer's disease, besides epilepsy.
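The band limits above translate directly into a lookup table that can be used to pick spectrum bins for a chosen set of sub-bands. The sketch below is illustrative: the theta range of 4 to 8 Hz is the conventional value and is assumed here, bands are treated as half-open intervals so that shared edges are unambiguous, and the names `band_of` and `select_bins` are hypothetical.

```python
# Band limits in Hz. Alpha, beta, delta and gamma follow the text;
# the theta range (4-8 Hz) is the conventional value, assumed here.
BANDS_HZ = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
    "gamma": (30.0, 50.0),
}


def band_of(freq_hz):
    """Return the band containing freq_hz, or None if outside 0.5-50 Hz.

    Intervals are half-open [low, high), so 4.0 Hz counts as theta.
    """
    for name, (low, high) in BANDS_HZ.items():
        if low <= freq_hz < high:
            return name
    return None


def select_bins(freqs_hz, bands):
    """Indices of spectrum bins (e.g. STFT rows) inside the chosen bands."""
    return [i for i, f in enumerate(freqs_hz) if band_of(f) in bands]


# Keep only the delta and alpha rows of a coarse frequency axis.
print(select_bins([0.0, 2.0, 6.0, 10.0, 20.0, 40.0], {"delta", "alpha"}))
```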
Abnormal EEG activity is often prevalent in people with neurological or other diseases and absent from normal individuals. IED is the abnormal synchronous electrical discharge that originates in an epileptic focus with a group of malfunctioning neurons [18]. Sharps and spikes are the prominent abnormal EEG waveforms and manifest as pointed peaks, serving as biological markers for either focal or generalized epileptogenesis. Spike waves are transients that typically last between 20 and 70 ms. Sharp waves are similar but last longer, with a typical duration of 70-200 ms. Besides duration, sharps and spikes can vary in other waveform properties, such as voltage and frequency. Their occurrence can be single or repetitive, and their distribution can be focal or generalized. The appearance of sharps and spikes is asymmetric, with the initial deflection usually having the sharper slope. They can be observed as isolated waveforms or can be followed by slow waves. The subtypes can be divided in multiple ways. For example, by localization, there are temporal/centrotemporal/occipital/generalized spikes and sharp frontal waves; by frequency, spikes and sharps are associated with various frequency ranges, such as 6-Hz spike-and-wave, polyspikes, and 14- and 6-Hz positive bursts. The spike-and-slow-wave complex is the occurrence of a spike followed by a longer-duration slow wave, with varying frequency and amplitude and often distinct from the underlying background. A sharp wave can be the initial waveform rather than a spike. A sharp-and-slow-wave complex is identical to the spike-and-slow-wave complex, except that a sharp wave, rather than a spike, precedes the slower and broader wave. In these discharges, the slow wave that follows may symbolize inhibition and subsequent hyperpolarization of cortical neurons, which accompany the initial synchronous depolarization [19].
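The duration criterion that separates spikes from sharp waves can be written as a simple rule. This is a deliberately simplified sketch: it uses duration only, ignores morphology and amplitude, and assigns the shared 70 ms boundary to sharp waves, which is an assumption made here. The function name is hypothetical.

```python
def classify_transient(duration_ms):
    """Label a transient by duration alone (20-70 ms spike, 70-200 ms sharp).

    The 70 ms boundary is assigned to "sharp wave" by assumption.
    """
    if 20 <= duration_ms < 70:
        return "spike"
    if 70 <= duration_ms <= 200:
        return "sharp wave"
    return "other"


for duration in (45, 70, 150, 400):
    print(duration, "ms ->", classify_transient(duration))
```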
For epilepsy patients, the above abnormalities are routinely observed between seizure periods and suggest an underlying propensity toward seizures; nevertheless, the abnormalities during a seizure do not necessarily result in observable clinical behavior.

Seizure types
Seizures and epilepsy are classified by the International League Against Epilepsy (ILAE) using modern terminology and concepts [20]. The two broad types are generalized and focal seizures. Generalized seizures arise in neuronal networks distributed bilaterally, while focal seizures are limited to one hemisphere. Seizures may propagate from a partial to a generalized state, when the neuronal network is initially partly altered and may become completely dysfunctional at a later stage. Table 1 reports a selection of seizure categories, together with descriptions of their symptoms. For clarity of presentation, the list is partial, as it includes only the seizure types contained in the dataset used in this work.

Seizure classification with EEG
The abnormal EEG recordings of patients who suffer from seizures contain four states: pre-ictal, ictal, inter-ictal, and post-ictal. Pre-ictal and post-ictal are the signal portions before the seizure occurs and after the seizure diminishes, respectively. Inter-ictal refers to the abnormal signal activity between epileptic seizures, and ictal to the abnormal signals during an epileptic seizure [21].
The electrographic signature of a seizure is composed of spike and sharp complexes and other abnormal activities that can be inspected over a longer duration than during inter-ictal periods. Occasional transient waveforms are the signature of inter-ictal activities in EEG. They can appear as isolated spikes, sharps, or spike-wave complexes. IED generally supports the diagnosis of seizure disorders such as:
(i) Partial seizures: the EEG in partial seizures has two or more distinct phases, which are metamorphic. The common pattern consists of a series of spike and sharp waves, mixed with rhythmic waves and accompanied by amplitude attenuation. The frequency and amplitude of the waveforms change dynamically as the seizure spreads across brain regions. At the end of a seizure, the frequency of sequential spikes or rhythmic waves diminishes to a slow spike-wave pattern. Temporal lobe seizures often begin in the alpha or theta frequency range, with a smaller proportion of slower waves. Extra-temporal seizures frequently start in the beta band. The metamorphic patterns are often followed by post-ictal delta slowing, with suppression or activation of focal spikes. Moreover, electrodecremental events observed focally can localize the seizure onset zone. They also reflect high-frequency firing or intense neuronal depolarization. However, non-ictal generalized electrodecremental events preceding focal seizures may represent a cerebral alteration that leads to focal seizures. Note that simple partial seizures with sensory rather than motor symptoms may not be distinguishable in the EEG activity up to 80% of the time. With more closely spaced electrodes, however, the ictal activity can be recognized.
(ii) Generalized seizures: absence seizures have isomorphic and stereotyped features, although the frequency and amplitude change with seizure progression. For example, spike-wave discharges may start at 3.5 to 4 Hz at onset and diminish to 2 to 3 Hz, and the spike amplitude also reduces in the later part of the seizure. Diffuse polyspike-wave complexes can precede tonic-clonic seizures. Ictal signals during the first phase often show generalized attenuation of rhythmic waves and a gradual increase in voltage, then evolve into polyspikes. The second phase often shows slow waves mixed with paroxysmal spike activities. Then there is a gradual recovery of rhythms following generalized attenuation in the post-ictal period. Tonic seizures exhibit generalized paroxysmal fast activities or diffuse voltage attenuation with associated sharp and slow-wave complexes. Myoclonic seizures are often accompanied by 10 to 15 Hz polyspikes and slow waves. Generalized atonic seizures may show 2-3 Hz spike-wave discharges or may not be associated with any EEG change.
However, the EEG may not capture all of the ictal activities because of the limitations of the technique. For example, the skull and scalp may filter out some frequency waveforms, and the placement of recording electrodes may change the distance to and orientation of the seizure focus. Despite these limitations, seizures recorded by EEG can provide helpful information regarding the seizure type and focus.

Related work
Advances in machine learning have boosted its application in biomedical fields. Seizure detection has been widely studied from different perspectives and using different features. For seizure detection, the discriminative features consist of morphological, biological, and rhythmical features. Morphological features are used to detect and differentiate the components of amplitude and fundamental frequency in the EEG waveform [4,22]. Biological features, such as synchronization likelihood, help to distinguish epileptic seizure activities from non-epileptic background activities [23]. Rhythmical characteristics include features in the time, frequency, or combined domain.
In addition, to fit the non-linear and non-stationary nature of brain signals and capture the changes reflected in EEG, research has been conducted to include magnitude components of the signal in the time domain [24], spectral representations and magnitudes of signals in the frequency domain [5,25], and time-frequency domain features [26,27]. In this section, we first introduce previous work on the role of frequency bands in brain functionality for abnormal EEG detection. We then present seizure classification approaches that use EEG channel selection to reduce noise and computational power.

Frequency bands selection
Traditional signal processing techniques have been applied to extract the morphological patterns that constitute an epileptic seizure. Epileptic EEG recordings of spike and sharp-wave complexes can be easily distinguished by the morphological characteristics of the waveforms' amplitude, shape, and duration. The grounding idea is the geometric difference between spikes and background activities, such as the distinctive slope and the sharpness, height, and length of the waves. In morphological analysis, EEG waves are often decomposed into smaller physical parts, such as two opposite half-waves [28][29][30], so that the structural divergence between background activities and spike complexes can be observed.
One of the first automated seizure detection algorithms, developed by Gotman in 1982 [4], analyzed signals morphologically. The system first breaks down EEG signals into half-waves and searches for morphology-based features, particularly the epileptiform spike and sharp waves, in recordings of 16 bipolar channels. It applies frequency thresholds between 3 and 20 Hz and a relative amplitude measured against a dynamic baseline from a background window in the time domain. A seizure is declared when the degree of rhythmicity in at least two channels exceeds the thresholds and lasts for four seconds. The algorithm successfully detected seizures with rhythmic activities using a fixed threshold, but it fails when seizures consist of a mixture of frequencies or amplitudes. Moreover, since rhythmic activity can be induced by normal activity or artifact bursts rather than pathological activity, the algorithm's detections may not be associated with seizures. Further studies found that the key to morphological analysis is to select a proper filter that restrains the background activities while retaining spikes. Nishida et al. [22] presented a detection method using a morphological filter, with the basic algorithm of the open-closing morphological operation and structure elements of second-order polynomial functions. Pon et al. [31] proposed a mathematical morphology approach plus wavelet transform to detect bi-directional spikes with a circular structure element. Xu et al. [32] improved the morphological filter by using differences in geometric characteristics to separate spikes from background activities. EEG enhancement strategies have been introduced in various works with the aim of better detecting spikes, increasing the number of candidate spikes, minimizing missed seizure events, and minimizing false selections [33][34][35][36].
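The half-wave decomposition that underlies this family of morphological detectors can be sketched in a few lines. The code below is not Gotman's algorithm; it only splits a signal at its turning points and checks whether a half-wave falls inside the 3 to 20 Hz band mentioned above, treating a half-wave of duration d as half a cycle (frequency 1/(2d)). The function names and the toy signal are hypothetical.

```python
def half_waves(samples, fs_hz):
    """Split a signal at its local extrema.

    Returns one (duration_s, amplitude) pair per half-wave. Flat
    plateaus are not treated as turning points in this simple sketch.
    """
    if len(samples) < 2:
        return []
    extrema = [0]
    for i in range(1, len(samples) - 1):
        before = samples[i] - samples[i - 1]
        after = samples[i + 1] - samples[i]
        if before * after < 0:  # slope changes sign: a peak or a trough
            extrema.append(i)
    extrema.append(len(samples) - 1)
    return [((b - a) / fs_hz, abs(samples[b] - samples[a]))
            for a, b in zip(extrema, extrema[1:])]


def in_band(duration_s, low_hz=3.0, high_hz=20.0):
    """True if the half-wave's equivalent frequency lies in [low, high]."""
    freq_hz = 1.0 / (2.0 * duration_s)
    return low_hz <= freq_hz <= high_hz


# Toy triangular wave sampled at 20 Hz (made-up values).
waves = half_waves([0, 1, 2, 1, 0, 1, 2], fs_hz=20)
print(waves)
print([in_band(duration) for duration, _ in waves])
```

A full detector would additionally compare each amplitude with a running background estimate and require agreement across channels, as described above.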
The analysis of abnormal EEG spikes and sharps is the gold-standard to diagnose seizures, while the background activity has been scarcely studied. The main reason is that by applying visual inspection of brain signals the abnormalities in background EEG (α, β, θ, δ) cannot be distinguished. However, studies have shown that the background activities contain vital information about function and dysfunction of the brain in human epilepsy [7]. The seemingly normal EEG background bands may include evident abnormalities that can be used as a significant prognostic tool.
Alpha rhythm slowing in epilepsy was observed to be associated with mental deterioration [37]. Peak alpha frequency variability has been found between epilepsy patients and the control group, with a lower alpha frequency in the epilepsy group [38]. When the dependencies on antiepileptic drugs are ruled out, the epilepsy biomarker was sensitive to alpha rhythm abnormalities [39]. Alpha spectral power shifts from high to low in both focal and idiopathic generalized epilepsy patients compared to healthy subjects, indicating poorer seizure control [40]. Alpha oscillations have been used as an index of abnormal cortical-subcortical brain network function in photosensitive epilepsy patients [41].
The association of theta signals with epilepsy has been demonstrated as well. Theta band activity has been found to be positively related to the number of epileptic seizures in patients with brain tumors [42]. When monitoring inter-ictal activities and theta oscillations in parallel, spatial deficits correlated with a decrease of theta power but were not significantly related to inter-ictal activities in a rat model of temporal lobe epilepsy [43].
Delta oscillations have been shown to have a high correlation with epilepsy. The asymmetry of the delta signals can be used as a biomarker to identify the epileptogenic zone [44]. Temporal intermittent rhythmic delta oscillations can be a signature of focal epilepsy [45,46]. Delta slow waves are often prevalent in patients with uncontrolled seizures [47], and inter-ictal regional delta slowing has also been found to correlate positively with the surgical outcomes of temporal lobe epilepsy patients [48,49].

EEG montages selection
EEG montages refer to the electrodes located on the patient's scalp. Montages are named consistently with the locations of the brain cerebrum, with the abbreviations Fp (prefrontal), F (frontal), C (central), P (parietal), T (temporal), and O (occipital). A large number of montages (often ranging from 19 to over 100) are used when performing different tasks such as emotional response analysis, sleep recordings, and drug effect diagnosis. For efficient analysis, it is vital to pick out the montages that carry the most discriminative information. Neurologists' selection with prior knowledge demonstrates that seizure detection can be effective with only a small number of montages [12,13]. In computer science, the selection of EEG montages is widely studied [10] for faster seizure detection and noise removal. However, the underlying principles guiding montage selection by neurologists and by computer scientists are different. Computer scientists generate a subset of electrodes from the whole set using various statistical measures, such as variance and entropy, as well as techniques such as power spectral estimation and wavelet transform. The selection made by neurologists is directed by their experience in diagnosing the particular disease as well as by the underlying knowledge of which brain areas generate the abnormalities in the electric signals.
The montage selection has manifold objectives including the reduction of model computational complexity and model overfitting. By utilizing the montage that contains significant features, the associated brain areas of the montages can be identified. The specific regions of the brain contain vital information about the Seizure Onset Zone (SOZ) that may reflect where IEDs originate.
Several studies have shown the functional and topological changes of the brain network during the inter-ictal and ictal phases [50][51][52]. Burns et al. [53] studied the network structure of epilepsy patients' brains by constructing a graph from intracranial electrocorticographic (ECoG) recordings. The nodes are electrodes, and the edges are node pairs weighted by the coherence of the associated frequency bands. They found that the brain network dynamics can be characterized by a finite set of states, where the progress of a seizure can be defined by a consistent sequence of sub-states. Moreover, during the sub-states, a subset of nodes becomes isolated, and this subset of nodes can identify the SOZ with high sensitivity and specificity. Martinet et al. [54] analyzed the brain dynamics during seizures at the microscopic and macroscopic levels. Their results indicate that distance is a vital factor: in the coupling across spatial scales, the electrical voltage of activities decreases with longer distances, and the increase in the coherence of wave propagation depends on distance.
With the supporting evidence that seizures are related to specific parts of the brain cortex, further studies on seizure identification utilized the information of brain areas together with recording montage references. Using a subset of electrodes to distinguish seizures has been shown to be a plausible approach by both in vivo experiments [12,13] and machine signal processing [14][15][16]. With the ultimate purpose of implementing seizure detection models in wearable or invasive devices, machine learning-based automatic channel selection aims to reduce the computational cost of the models. The typical procedure in channel selection consists of three steps: electrode subset generation, subset evaluation, and result validation. Subsets are first generated and then evaluated. Subset generation has been explored using complete search, sequential search, or expert-generated subsets. In the literature, five main approaches can be found for subset generation: filtering, wrapping, embedded, hybrid, and human-based techniques [10]. Truong et al. [15] select the channels by comparing the spectral power and correlation in both the frequency and time domains between electrode pairs. Their method is two times faster than methods without channel selection while maintaining the same level of accuracy and area under the curve. Ibrahim et al. [16] used a statistical approach on time-domain signals for channel selection. They sliced the data with a sliding window into non-overlapping 10-second segments, then calculated the probability density functions (PDFs) of the derivatives, local means, local variances, and medians for each segment. The resulting multi-bin PDFs are studied individually and compared with pre-defined thresholds on prediction and false-alarm probability. After the comparison, the bins are selected from certain channels for seizure prediction.
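The segmentation step in the statistical approach above can be illustrated as follows. The sketch cuts one channel into non-overlapping windows and computes the local mean, variance, and median per window; the derivative feature, the PDF estimation, and the threshold comparison are omitted, and the function name `segment_stats` is hypothetical rather than taken from [16].

```python
from statistics import mean, median, pvariance


def segment_stats(samples, fs_hz, window_s=10):
    """Per-segment statistics over non-overlapping windows.

    A trailing partial window is dropped. Returns one dict per segment.
    """
    size = int(window_s * fs_hz)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    stats = []
    for start in range(0, len(samples) - size + 1, size):
        segment = samples[start:start + size]
        stats.append({
            "mean": mean(segment),
            "variance": pvariance(segment),  # population variance
            "median": median(segment),
        })
    return stats


# Toy channel: 7 samples at 1 Hz with 3-second windows (made-up values).
for row in segment_stats([1, 2, 3, 4, 5, 6, 7], fs_hz=1, window_s=3):
    print(row)
```

Channels could then be ranked by how strongly these statistics differ between ictal and inter-ictal segments, which is the role the thresholds play in the description above.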

Dataset
In the past, seizure prediction studies using EEG signals have been limited by a lack of standardized and qualified data [55]. The EEG data have often been acquired from Intensive Care Units (ICU) [56], presurgical inpatients [57], animals [58], and implantable devices [59]. The usefulness of the data was limited by the relatively short recording durations and by concerns about inter-ictal time sampling in patients and animal models. Furthermore, the data were primarily held at the institutions where they were acquired and not made available for the community to use.
Recently, the opening and sharing of longer-lasting and high-quality publicly accessible chronic EEG datasets, such as the CHB-MIT Dataset [60], the UPenn and Mayo Clinic seizure detection dataset [61], and the Temple University Hospital (TUH) Dataset [62], has made possible the advance of seizure prediction algorithms.
The TUH Seizure Corpus (TUSZ) assembles EEG data in clinical settings from archival and ongoing records at Temple University Hospital [63]. TUSZ is the most extensive open-source corpus both in terms of quantity and heterogeneity, with a wide variety of seizure morphology in aspects of frequency, amplitude, and onsets. The TUSZ is organized by patient and session in a hierarchical file tree structure. Each patient folder is composed of sub-folders corresponding to their recording sessions. Each session has EEG signals stored in a standard European Data Format (EDF) and the corresponding clinician reports in text format (TXT) collected by certified neurologists.
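The patient/session hierarchy described above lends itself to a simple directory walk that pairs each session's recordings with its reports. The sketch below assumes a simplified two-level layout (patient folders directly under the root, session folders inside them) and lowercase `.edf`/`.txt` extensions; the actual corpus layout may contain additional levels, and the function name is hypothetical.

```python
from pathlib import Path


def pair_recordings_with_reports(corpus_root):
    """Yield (patient, session, edf_paths, report_paths) per session folder.

    Assumed layout (simplified): <root>/<patient>/<session>/{*.edf, *.txt}
    """
    root = Path(corpus_root)
    for patient_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for session_dir in sorted(s for s in patient_dir.iterdir() if s.is_dir()):
            edf_paths = sorted(session_dir.glob("*.edf"))
            report_paths = sorted(session_dir.glob("*.txt"))
            yield patient_dir.name, session_dir.name, edf_paths, report_paths
```

Iterating over the generator gives one tuple per session, so the text of each report can be processed next to the signals it describes.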

EEG data
TUSZ v1.5.2, released in May 2020, comprises 3050 seizure events. More specifically, the EDF files include the following metadata: anonymized patient ID, age (in years), gender, recording date, and per-channel information (labels, sample frequency, channel physical dimension/min/max, prefiltering channel conditions). It is also worth noting that the EDF files include a varying number of channels and varying sampling rates [62], where the channels consist of EEG-specific channels coupled with supplemental channel information such as detected bursts and photic stimuli. EEG signals have been sampled at 250 Hz, 256 Hz, 400 Hz, or 512 Hz.
The EEG data of each patient include information about: (i) assigned numbering: patient ID, the session numbers of each patient, the file numbers of each session, and the filename (consisting of electrode reference, patient, session, and file IDs); (ii) EEG type and subtypes: Epilepsy Monitoring Unit (EMU), Intensive Care Unit (ICU) including eight subtypes (burn unit, cardiac intensive care, intensive care unit, neuro-ICU facility, neural surgical ICU, pediatric intensive care unit, respiratory intensive care unit, surgical intensive care unit), inpatient but not ICU (Inpatient) including three subtypes (emergency room, operating room, Inpatient but not ICU or outpatient), routine EEGs (Outpatient), and uninformative EEG reports (Unknown); (iii) EEG label: LTM-or-Routine, Normal-or-Abnormal; (iv) description of seizures: number of seizures per file, number of seizures per session, seizure start time, seizure stop time, and seizure type.
In this study, we use four selected categories: the filename for extracting the data, the seizure type for clustering, and the seizure start and stop times for ictal/inter-ictal labeling. TUSZ contains three types of electrode reference, all constructed from neutral record, as listed in Table 2. Here, 03_tcp_ar_a is 01_tcp_ar without the electrodes A1 and A2, which connect to the left ear and right ear. The three types of electrode references used to record brain signals in TUSZ have fundamental hardware and interpretation differences as described in . This variability in referential montages may affect the performance of machine learning algorithms [64].
Statistical analysis has been performed to investigate the latent divergence caused by the three electrode references in TUSZ. To select suitable subjects that exclude the variation caused by individual difference, seizure type, and health state, the following filters are applied: (1) subjects need to have recordings from the three different electrode references for parallel comparison; (2) subjects need to have the same EEG type, which means the same health state; (3) recording files must not contain seizures, as seizures alter the recorded signals. Two subjects, with patient IDs 4671 and 6514 and both with EEG type ICU, have been selected for this analysis.
The statistical variation is measured in three domains (amplitude, time correlation, and frequency correlation) using the mean, standard deviation (STD), minimum, and maximum. Before the measurements are taken, data are pre-processed by resampling and unit scale normalization. The Fast Fourier transform (FFT) is applied to transform data from the time domain to the frequency domain. The correlation in time and frequency is calculated by taking the eigenvalues of the correlation coefficient matrix. The result is provided in Table 3. From the table, it is clear that, across the three types of electrode references, the mean and standard deviation deviate greatly in the amplitude domain, whereas the basic statistics remain at the same level in time and frequency correlation. The results suggest that the recordings from the three types of electrode references have high inter-variability and low intra-variability. Thus, the classification is performed separately under each electrode reference.

Textual data
The clinical reports provided together with the signal files in TUSZ are critical for seizure diagnosis. This feature makes TUSZ unique, as no other dataset provides this information. The clinical reports document the medics' knowledge of the patients. The information given contains an introduction, clinical history, medications, record description, and seizure impression. Primary exploration and further feature extraction have been applied to the textual data to utilize the information delivered in the reports. Term Frequency-Inverse Document Frequency (TF-IDF), a Natural Language Processing (NLP) technique, has been used to calculate the pairwise similarity among the clinical reports of different seizure types; the resulting similarity heatmap is shown in Fig. 4. Moreover, the top-frequency words by TF-IDF are recorded and compared for every seizure type. The full list of the top 50 keywords can be found in Tables 11 and 12 in the Appendix. From the list, four common observations can be made based on the terms with high occurrence: (1) general words: seizure, activity, EEG, clinical, record/recording, etc.; (2) descriptive words of seizures: spike, sharp, complex, discharge, etc.; (3) location words: right, left, frontal, temporal, hemisphere, posterior, anterior, etc.; and (4) background signal frequency words: alpha, beta, theta, delta (δ, θ, α, β).
For observations (1) and (2), it is self-evident that these words are frequent in clinical reports because they describe the abnormal brain signals that lead to a seizure event, and no further information can be derived from them. By contrast, the terms in (3) and (4) are informative: they carry the knowledge of what the medics are inspecting in the given EEG signals and how they would ultimately diagnose the seizure. For example, some of the record descriptions from the clinical reports are quoted here: "A prominent increase in beta noted at 3 a.m.", "A status epilepticus pattern with prominent epileptiform activity from the right occipital and temporal region.", "There is a posterior dominant rhythm of 8 Hz, 30 to 50 V with a small amount of low voltage, frontocentral beta activity.". These descriptions reflect the thinking process of experts when they examine the EEG data. To mimic the thinking flow of medics when dealing with seizures, we designed a classification framework that aims to classify the ictal/inter-ictal state in a similar way. The process is introduced in Sect. 5.
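As an illustration of the pairwise similarity computation, the following is a minimal sketch in plain Python, assuming whitespace tokenization, a smoothed IDF term, and cosine similarity as the similarity measure. None of these choices is stated in the text, and the three one-line "reports" are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(docs):
    """Pairwise similarity matrix, the kind of data a heatmap would display."""
    vecs = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vecs] for a in vecs]

# Hypothetical report snippets, one per seizure type.
reports = [
    "theta slowing in the left temporal region",
    "left temporal theta activity with sharp waves",
    "generalized spike and wave discharges",
]
sim = similarity_matrix(reports)
```

In this toy input the first two snippets share location and band terms, so their similarity is positive, while the third shares no token with the first and scores zero.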

Methodology
This study aims to mimic the reasoning process of medics when they diagnose seizures and integrate the expert's knowledge into the classification process when classifying ictal/inter-ictal signals. Ultimately, the goal is to build a personalized seizure classification system. Not all the EEG signals are essential when identifying seizures from observation and using common sense. In many studies, the focus is on identifying signs of seizures from the seemingly abnormal waveforms like the spike and sharp waves. Nevertheless, as discussed in Section , previous works have demonstrated how the seemingly normal background bands play an important role when diagnosing seizures and how different areas of the brain are activated when a seizure happens.
The experiments comprise pre-processing of the input data, including time-frequency data transformation, language-signal mapping, and input selection, followed by classifier evaluation, as shown in Fig. 5.

Time-frequency transformation
Several methods are available to perform the time-frequency transformation of signals, such as the wavelet transform and the short-time Fourier transform (STFT). Previous studies have shown that STFT should be preferred over the wavelet transform for determining epileptic seizure activity in real time [65]. STFT is used for spectral analysis of EEG signals, and it can transform signals between the time and frequency domains owing to its time-shift invariance. Specifically, a sliding window partitions the time-series data into a finite number of segments, and a one-dimensional Fourier transform is applied to each segment, resulting in a two-dimensional time-frequency representation of the signal. The window length is a trade-off between spectral and temporal resolution: a longer window preserves more spectral information and less temporal information. As the window slides in time, the changes in spectral behavior are observed locally in each block. Different sliding window lengths were tested to balance time and frequency resolution. Finally, fixed-width sliding windows of 2 s with 50% overlap were adopted in this study. The number of samples per window depends on the sampling rate (250 Hz, 256 Hz, 400 Hz, or 512 Hz) of the specific recording file.
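The windowing described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes a rectangular window (no taper) and uses a naive discrete Fourier transform in plain Python for readability, where a real implementation would use an FFT routine. The function names are hypothetical.

```python
import cmath
import math

def window_bounds(n_samples, fs, window_s=2.0, overlap=0.5):
    """Return (start, stop) sample indices of fixed-width sliding windows."""
    win = int(round(window_s * fs))          # samples per window, depends on fs
    hop = int(round(win * (1.0 - overlap)))  # step between window starts
    if win <= 0 or hop <= 0:
        raise ValueError("window and hop must be at least one sample")
    return [(start, start + win) for start in range(0, n_samples - win + 1, hop)]

def dft_magnitudes(segment):
    """Magnitudes of the one-sided DFT of one window (naive O(N^2) version)."""
    n = len(segment)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(segment)))
        for k in range(n // 2 + 1)
    ]

def stft(signal, fs, window_s=2.0, overlap=0.5):
    """Time-frequency representation: one magnitude spectrum per window."""
    return [
        dft_magnitudes(signal[a:b])
        for a, b in window_bounds(len(signal), fs, window_s, overlap)
    ]

# Synthetic 10 Hz sine sampled at 250 Hz for 3 s (hypothetical test signal).
fs = 250
signal = [math.sin(2 * math.pi * 10 * n / fs) for n in range(3 * fs)]
spectrogram = stft(signal, fs)
bin_width_hz = fs / int(round(2.0 * fs))  # frequency resolution of a 2 s window
```

With a 2 s window the frequency resolution is 0.5 Hz regardless of the sampling rate, while the number of samples per window changes with it (500 at 250 Hz, 1024 at 512 Hz).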

Natural language mapping of signals
Frequency-domain EEG signals are further filtered to detect seizures based on expert knowledge and to build a resource-efficient classifier. More specifically, information from the clinical reports is extracted and mapped onto the EEG signals. Basic Natural Language Processing (NLP) techniques are used. First, noise such as punctuation is removed, then sentences are tokenized into a list of words. After that, each token is converted to lower case for more accurate selection. Finally, two lists of keywords are targeted: background frequency bands (alpha, beta, theta, delta, gamma) and zones of brain areas (prefrontal, frontal, temporal, parietal, occipital, central). At the end of the NLP pipeline, two lists of words are produced from each patient's documents if the above keywords and their variants are present.
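The steps above can be sketched as a short function. This is a minimal illustration under stated assumptions: hyphens are dropped so that "pre-frontal" matches "prefrontal", other punctuation becomes whitespace, and only exact keyword matches are counted. How the study handles other variants is not specified, and the function names are hypothetical.

```python
import string

BAND_KEYWORDS = ("alpha", "beta", "theta", "delta", "gamma")
ZONE_KEYWORDS = ("prefrontal", "frontal", "temporal", "parietal", "occipital", "central")

# Map every punctuation character except the hyphen to a space.
_PUNCT_TO_SPACE = str.maketrans({c: " " for c in string.punctuation if c != "-"})

def tokenize(report):
    """Remove punctuation, split into tokens, then lower-case each token."""
    cleaned = report.replace("-", "").translate(_PUNCT_TO_SPACE)
    return [token.lower() for token in cleaned.split()]

def extract_keywords(report):
    """Return (bands, zones): the target keywords present in one clinical report."""
    tokens = set(tokenize(report))
    bands = [k for k in BAND_KEYWORDS if k in tokens]
    zones = [k for k in ZONE_KEYWORDS if k in tokens]
    return bands, zones

bands, zones = extract_keywords(
    "A status epilepticus pattern with prominent epileptiform activity "
    "from the right occipital and temporal region."
)
```

Compound terms such as "frontocentral" are not matched by this exact-token rule; covering them would need an additional variant table, which is left out here.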
The signals are then mapped using the extracted lists of sub-bands and sub-zones for the classification process. The band list mapping is guided by the frequency range of alpha (8-12 Hz), beta (12-30 Hz), theta (5-8 Hz), delta (0.1-5 Hz), and gamma . The zone list is guided by the annotation of the "10-20 system" of Fp (pre-frontal), F (frontal), C (central), P (parietal), T (temporal), and O (occipital). The mapping is performed
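A minimal sketch of such a mapping is given below, assuming half-open frequency intervals (so that a boundary such as 8 Hz belongs to one band only) and bare channel labels such as "F3" or "CZ"; actual EDF labels carry extra prefixes and suffixes that would need stripping first. Gamma is left out because its range is not given above. All function names are hypothetical.

```python
# Band limits in Hz as listed in the text; [low, high) is an assumption.
BAND_RANGES_HZ = {
    "delta": (0.1, 5.0),
    "theta": (5.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 30.0),
}

# 10-20 system label prefixes for each brain zone.
ZONE_PREFIXES = {
    "prefrontal": "FP",
    "frontal": "F",
    "central": "C",
    "parietal": "P",
    "temporal": "T",
    "occipital": "O",
}

def select_bins(freqs_hz, bands):
    """Indices of the frequency bins that fall inside any of the listed bands."""
    return [
        i for i, f in enumerate(freqs_hz)
        if any(BAND_RANGES_HZ[b][0] <= f < BAND_RANGES_HZ[b][1] for b in bands)
    ]

def zone_of(label):
    """Brain zone of one electrode label, or None (for example A1 and A2)."""
    name = label.upper()
    if name.startswith("FP"):  # must be checked before the shorter prefix "F"
        return "prefrontal"
    for zone, prefix in ZONE_PREFIXES.items():
        if zone != "prefrontal" and name.startswith(prefix):
            return zone
    return None

def select_channels(labels, zones):
    """Indices of the electrodes located in any of the listed zones."""
    return [i for i, label in enumerate(labels) if zone_of(label) in zones]

freqs = [k * 0.5 for k in range(61)]  # 0 to 30 Hz in 0.5 Hz steps
theta_bins = select_bins(freqs, ["theta"])
picked = select_channels(["FP1", "F3", "C3", "T3", "O1", "A1"], ["frontal", "temporal"])
```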

Inputs selection
In order to compare the efficiency and resource utilization of the mapped signals, six types of input data were tested: (1) Zoned selected bands: selected electrodes and selected frequency bands extracted from clinical reports; (2) Selected bands: selected frequency bands extracted from clinical reports; (3) Zoned background bands: selected electrodes and background frequency bands; (4) Background bands: background frequency bands; (5) Zoned whole frequency: selected electrodes and the whole frequency range without any selection; (6) Whole frequency: the whole frequency range without any selection.
The "Zoned" feature extraction (approaches n. 1, n. 3, and n. 5) uses selected electrodes (Fp, F, C, P, T, O) corresponding to the brain zones (pre-frontal, frontal, central, parietal, temporal, occipital). The "Selected Bands" feature extraction (approaches n. 1 and n. 2) uses selected frequency sub-bands (a selection of δ, θ, α, β, γ). Approaches n. 3 and n. 4 evaluate the role of all background band signals (δ, θ, α, β, γ) in seizure and normal conditions. Approach n. 1 evaluates the classification power using as few data points as possible. Approach n. 6 uses all available data points without further selection, including abnormal seizure waveforms such as spikes and sharp waves.
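The six inputs can be viewed as combinations of two switches: whether electrodes are restricted to the reported zones, and which frequency bins are kept. The sketch below encodes that view, assuming a spectrum stored as a nested list indexed as spectrum[channel][bin]; the data layout and all names are hypothetical.

```python
from typing import NamedTuple

class InputConfig(NamedTuple):
    name: str
    zoned: bool  # True: keep only electrodes from the reported brain zones
    bands: str   # "selected", "background", or "whole"

INPUT_CONFIGS = {
    1: InputConfig("Zoned selected bands", True, "selected"),
    2: InputConfig("Selected bands", False, "selected"),
    3: InputConfig("Zoned background bands", True, "background"),
    4: InputConfig("Background bands", False, "background"),
    5: InputConfig("Zoned whole frequency", True, "whole"),
    6: InputConfig("Whole frequency", False, "whole"),
}

def build_input(spectrum, config, report_channels, report_bins, background_bins):
    """Reduce spectrum[channel][bin] according to one of the six configurations."""
    channels = report_channels if config.zoned else range(len(spectrum))
    if config.bands == "selected":
        bins = report_bins
    elif config.bands == "background":
        bins = background_bins
    else:
        bins = range(len(spectrum[0]))
    return [[spectrum[c][b] for b in bins] for c in channels]

# Toy spectrum with 4 channels and 6 frequency bins.
spectrum = [[c * 10 + b for b in range(6)] for c in range(4)]
smallest = build_input(spectrum, INPUT_CONFIGS[1], [0, 2], [1, 2], [1, 2, 3, 4])
largest = build_input(spectrum, INPUT_CONFIGS[6], [0, 2], [1, 2], [1, 2, 3, 4])
```

In this toy case input 1 keeps 4 of the 24 values, which mirrors how the most selective configuration uses the fewest data points.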

Classification models
In this work, we apply both traditional machine learning and deep learning models for seizure classification. The traditional machine learning models, Random Forest (RF) and Support Vector Machine (SVM), are used because they handle imbalanced datasets better and require fewer resources. The deep learning model is a multi-layer perceptron (MLP) artificial neural network with various topologies.
Random Forest (RF) [66]: The RF classifier is a tree-based ensemble, that is, a group of Decision Trees (DT). Each tree is built from a random vector sampled independently from the input vectors, and grows towards the maximum depth using a combination of data features. An unlabeled instance is evaluated by every DT in the ensemble. The final classification decision aggregates the trees, either by averaging the class probabilities they assign or by majority vote, in which each tree casts a unit vote and the most voted class is assigned to the instance. Thus, by growing the RF to the set number of trees, the algorithm generates trees with high variance and low bias.
Support Vector Machine (SVM) [67]: The SVM separates the observations of two classes with an optimal hyperplane using statistical theory. For linearly separable classes, the optimal hyperplane is the one that gives the widest margin between the two classes. The margin is measured by the vectors that are closest to the hyperplane; these vectors are therefore named 'support vectors'. SVM focuses on maximizing the margin and minimizing the number of misclassified vectors. In practice, a user-defined tuning parameter controls the trade-off between the two. Maximizing the margin can be solved using optimization techniques such as standard Quadratic Programming (QP). For non-linearly separable classes, the optimal hyperplane is determined in a higher-dimensional feature space where the classes become linearly separable. Kernels are functions that allow SVM to find optimal hyperplanes in the higher-dimensional space without knowing the explicit transformation and construction. The Radial Basis Function (RBF) kernel is one of the most popular non-linear mapping functions. The decision region of an RBF SVM can be the union of several disjoint areas, since the optimal hyperplane in the associated high-dimensional feature space yields non-linear decision boundaries that may be discontinuous.
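For reference, the RBF kernel is commonly written as K(x, y) = exp(−γ‖x − y‖²). The following minimal sketch evaluates it for two vectors; the parameter name gamma follows common library usage and the function name is hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance) for two vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
narrow = rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=1.0)   # influence decays quickly
wide = rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=1e-9)    # influence decays slowly
```

A small gamma makes distant points look similar (smoother decision boundary), whereas a large gamma makes the kernel value fall off sharply with distance.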
Multi-Layer Perceptron (MLP) [68]: The MLP is a feed-forward artificial neural network consisting of three types of layers: one input layer, one output layer, and at least one hidden layer. Each layer is composed of simple computational units called neurons, and neurons in adjacent layers are interconnected. The input layer receives the data and passes them to the first hidden layer. The output layer can be a list of categories or signals mapped to the input data. During training, the weights and biases of each neuron are adjusted to minimize the loss of the model, so that the hidden layers extract the salient features that have predictive power for the output. Nonlinear and continuously differentiable activation functions are applied at each hidden layer to transform the data and decide whether a neuron is activated. The Rectified Linear Unit (ReLU) is now used as a default activation function. It has a lower running time and a reduced likelihood of vanishing gradients, a problem often seen with other activation functions such as the sigmoid and the hyperbolic tangent (tanh).
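The forward pass of such a network with a single hidden layer can be sketched in a few lines. This is an illustration only: the sigmoid output unit for the binary decision is an assumption, the weights are made-up numbers, and training (the adjustment of weights and biases) is not shown.

```python
import math

def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]

def dense(x, weights, biases):
    """Fully connected layer; weights[j][i] links input i to unit j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input -> hidden layer (ReLU) -> one output unit (sigmoid, assumed)."""
    hidden = relu(dense(x, w_hidden, b_hidden))
    logit = dense(hidden, [w_out], [b_out])[0]
    return 1.0 / (1.0 + math.exp(-logit))

# Toy network: 2 inputs, 2 hidden neurons, 1 output (hypothetical weights).
probability = mlp_forward(
    x=[1.0, -2.0],
    w_hidden=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[2.0, 3.0],
    b_out=-2.0,
)
```

Here the second hidden neuron receives a negative pre-activation and is switched off by ReLU, so only the first neuron contributes to the output.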

Experiments
In this section, we present the experimental results. The results include the aggregation of keywords extracted with NLP, the seizure classification with six different inputs, and the ablation study. Table 4 reports the aggregation of sub-band keywords for each seizure type. Table 5 reports the aggregation of brain-zone keywords for each electrode reference. We count the binary appearance of each keyword in each clinical report (i.e., if a keyword occurs more than once in one report, the count is one). Each seizure type contains a varying number of clinical reports, which affects the number of discovered keywords. A useful additional indicator for analysis is therefore the proportion of keywords for each sub-band and electrode reference zone, indicated by the relative percentage in brackets.
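The binary counting rule can be sketched as below. The denominator of the relative percentage is an assumption: here it is the total number of keyword appearances for the seizure type, although the number of reports would be an equally plausible choice. The function name and the toy input are hypothetical.

```python
from collections import Counter

def aggregate_keywords(reports_keywords):
    """Count each keyword at most once per report.

    reports_keywords holds one list of extracted keywords per clinical report.
    Returns {keyword: (count, relative percentage)}.
    """
    counts = Counter()
    for keywords in reports_keywords:
        counts.update(set(keywords))  # set(): repeated mentions count once
    total = sum(counts.values())
    return {kw: (n, 100.0 * n / total) for kw, n in counts.items()}

# Three hypothetical reports of one seizure type.
aggregated = aggregate_keywords([
    ["theta", "theta", "beta"],
    ["theta"],
    [],
])
```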

Natural language keywords aggregation
From the natural language processing, we obtain the following information about the clinical reports. The GNSZ clinical reports are characterized by the Theta and Beta frequency bands together with Temporal (T) and Frontal (F) electrode references. FNSZ clinical reports point to the Delta and Theta bands while maintaining the same Temporal (T) and Frontal (F) brain reference zones, with varying relative percentages compared to GNSZ. The Gamma frequency band is not present in many reports: only a small percentage (0.9%) of the large FNSZ set contains this information. The brain zone reference electrodes are evenly present across reports, with the exception of pre-frontal (Fp) electrodes, which are found in only 0.2% of FNSZ reports.
In the following experiments, a fixed number of electrodes has been selected for every patient as a baseline. Electrode references 01_tcp_ar and 02_tcp_le contain 21 electrodes in total, and five electrodes (A1, A2, CZ, C3, C4) have been used as the baseline set for all the patients. The electrode reference 03_tcp_ar_a contains 19 electrodes in total, and three electrodes (CZ, C3, C4) have been selected.

Seizure classification

Experimental settings
In the classification experiments presented in this work, the "positive class" is the inter-ictal EEG phase where no seizure happens, and the "negative class" is the ictal EEG signals where seizures are observed. Training of the classifiers has been performed using fivefold cross-validation, where the dataset is split in

Evaluation metrics
Seizure detection is a binary classification problem. The experimental results have been evaluated using accuracy (ACC), area under curve (AUC), sensitivity (TPR), and specificity (TNR) scores. The metrics are built on four basic counts: true positives (TP), where a predicted seizure corresponds to a seizure in the dataset; false positives (FP), where a predicted seizure is a non-seizure in the dataset; true negatives (TN), where a predicted non-seizure is a non-seizure; and false negatives (FN), where a predicted non-seizure is a seizure. The evaluation metrics can be calculated from these counts as follows:
• Accuracy (ACC): $\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$
• Sensitivity (TPR): $\mathrm{TPR} = \frac{TP}{TP + FN}$
• Specificity (TNR): $\mathrm{TNR} = \frac{TN}{TN + FP}$
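The four counts and the derived scores can be computed as in the following sketch. The label that plays the role of the positive class is passed as a parameter, and the AUC is computed with the rank-based definition (the probability that a positive sample receives a higher score than a negative one), which is one standard way to obtain the area under the ROC curve. The function names and toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

def auc(y_true, scores, positive=1):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.2, 0.6, 0.8, 0.1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```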

Model performance
After the NLP extraction of textual data and EEG mapping of EDF data, seizure signals are classified by using the six inputs described in Section .
The model parameters have been tested in multiple configurations. For RF, the total number of trees generated by the model was set to 100. The model was also tested with 50 trees, where it was found to be overfitting, and with 150 trees, which increased the execution time with only minor improvements in the results. The minimum split of each tree was evaluated at the default value of 2 and at 5. Since the dataset is large, the model was underfitting with a minimum split of 2, and a minimum split of 5 performed better. To evaluate the quality of a split, we selected the entropy measure, as it is more sensitive to impurity and fit the data better than the Gini index. For SVM, given the non-linearly separable data, the RBF kernel was selected. The regularization parameter (lambda) between 500 and 1200 and the kernel coefficient (gamma) between 1e−10 and 10 were tested using grid search. The model performs well with a tight margin (high lambda value), but overfitting is detected at high gamma values, where the influence of each support vector is confined to a small radius. The regularization parameter of 1000 and the kernel coefficient of 1e−9 were selected, also taking the computational time into consideration. For MLP, the model was tested with 1 to 3 hidden layers and with 16 to 1024 neurons per layer. When balancing the fit of the model, one hidden layer with 256 neurons was found to perform best. The other parameters were set to the default settings of the scikit-learn library. We omit the complete results and list below the model parameters used in the experiments:
• RF: A random forest classifier with entropy impurity, number of trees set to 100, and minimum number of samples per split set to 5;
• SVM: A support vector machine classifier with RBF kernel, regularization parameter of 1000, and kernel coefficient of 1e−9;
• MLP: A multi-layer perceptron classifier with a single hidden layer of 256 neurons, ReLU activation function, Adam weight optimizer, and regularization parameter of 1e−4.
The performance of the sub-bands selection is listed in Table 6. The results of the sub-zones selection are listed in Table 7. The data sample size and recording time duration (in seconds) of the six inputs are listed for the three electrode references in Table 9. In Table 8, the results of both sub-bands and sub-zones with six inputs for seizure type GNSZ are presented as a representative sample. In this section, we present the results of the four seizure types GNSZ, TCSZ, FNSZ, and CPSZ, which are the most sizeable in the dataset. The results of seizure types ABSZ, MYSZ, TNSZ, and SPSZ are listed in the Appendix.
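The configurations listed above could be expressed with scikit-learn as in the following sketch. The mapping of the reported "regularization parameter" to the `C` argument of `SVC` and to the `alpha` argument of `MLPClassifier` is an assumption based on the scikit-learn parameter names, not something the text states; all other arguments keep their library defaults. The helper name `build_models` is hypothetical.

```python
# Hyperparameters reported in the text, keyed by scikit-learn argument names.
PARAMS = {
    "rf": {"n_estimators": 100, "criterion": "entropy", "min_samples_split": 5},
    "svm": {"kernel": "rbf", "C": 1000, "gamma": 1e-9},
    "mlp": {
        "hidden_layer_sizes": (256,),
        "activation": "relu",
        "solver": "adam",
        "alpha": 1e-4,
    },
}

def build_models(params=PARAMS):
    """Instantiate the three classifiers (requires scikit-learn to be installed)."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    return {
        "rf": RandomForestClassifier(**params["rf"]),
        "svm": SVC(**params["svm"]),
        "mlp": MLPClassifier(**params["mlp"]),
    }
```

Each returned estimator could then be evaluated with fivefold cross-validation, for example through `sklearn.model_selection.cross_val_score(model, X, y, cv=5)`, where `X` and `y` stand for one of the six input sets and its ictal/inter-ictal labels.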

Ablation study
We performed an ablation study for the pre-defined electrodes in the zone list when classifying the seizures. More specifically, only five electrodes (A1, A2, CZ, C3, C4) are used for 01_tcp_ar and 02_tcp_le, and three electrodes (CZ, C3, C4) are used for 03_tcp_ar_a to conduct the experiments. The results of the study are shown in Table 10.
The objective of the ablation study is threefold. Firstly, to evaluate the effectiveness of the zone selection by comparing the results with the performance of the selected zones as shown in Table 6. Secondly, to assess the classification results obtained with the essential zones of seizures as reported in previous studies [12,13]. Thirdly, to provide baseline electrode data for patient recordings that do not have a specific description of zones in the clinical report. The pre-defined electrodes are consistent across all seizure types. The results can be compared both horizontally across seizures and vertically within seizures. From Table 10, it is clear that all four major seizure types have no data in set_zoned_selected_bands. Further, for 03_tcp_ar_a, both selected bands and background bands have no data samples. Comparing Table 10 and Table 6, it is possible to recognize the potential of selected electrodes beyond the pre-defined ones. The whole frequency ablation results are promising: they reveal the power of the essential electrodes and confirm the results of previous studies.

Discussion
The classification experiments in this study evaluated two main types of selection (sub-bands and sub-zones) with six predefined inputs (Zoned Selected Bands, Selected Bands, Zoned Background Bands, Background Bands, Zoned Whole Frequency, Whole Frequency) to answer the three research questions.
The extracted keyword aggregation presented in Table 5 shows the importance of recording EEG signals in the frontal, temporal, and central zones of the brain cortex. Temporal lobe electrodes are often mentioned in clinical reports, and the term prevails in seizure types GNSZ, TNSZ, and CPSZ. Besides the temporal lobe, electrodes placed at central or frontal positions account for a large proportion of the mentions and may affect seizure classification performance. The occurrences of the frequency band keywords are listed in Table 4. Theta is predominant in GNSZ, FNSZ, and CPSZ, while beta is often observed and mentioned in seizure types TNSZ and TCSZ when compared to other frequency bands; delta does not outweigh the other bands for seizure classification. The two aggregations also show that the pre-frontal brain cortex and alpha activities are often neglected in experts' reports. Table 6, with inputs 2, 4, and 6, aims to answer RQ1, namely how the selection of α, β, θ, δ, γ influences the classification. Table 7, with inputs 1, 3, and 5, addresses RQ2.
Table 8 Performance metrics of GNSZ with "sub-bands" and "sub-zones" selection
Table 9 Size and recording time duration of "sub-bands" and "sub-zones" selection of seizure type GNSZ, TCSZ, FNSZ, CPSZ
Table 10 Performance metrics of ablation study for seizure types GNSZ, TCSZ, FNSZ, CPSZ
With certain frequencies filtered out, the performance metrics drop in almost all the experiments, as may be expected, with the exception of some minor points where they improve. However, for the seizure classification problem, accuracy and specificity do not play a vital role. The important measures are sensitivity, which refers to the model's ability to detect a real seizure as positive, and AUC, which indicates the model's ability to distinguish ictal from inter-ictal signals. In general, the Random Forest classifier outperforms the other classification algorithms considered here. Although the sensitivity and AUC in sub-band selection deviate from those obtained with the whole frequency range, the deviation for background frequency band selection is acceptable given that the execution time is reduced. This supports the effectiveness of selecting background frequency bands as a potential classification approach. Channel selection results are more consistent, making it a considerably more plausible approach. Comparing the three groups of sub-bands (inputs 1 & 2, inputs 3 & 4, inputs 5 & 6), the inputs with specific electrodes filtered out achieve better results than those using all the channels.
The results are more comprehensive when looking at sub-bands and sub-zones selection together. The performance metrics of GNSZ, taken as a representative sample, are shown in Table 8, with the size of the six inputs listed in Table 9. The RF results are visualized in Fig. 6 for the reference electrodes 01_tcp_ar, in Fig. 7 for the reference electrodes 02_tcp_le, and in Fig. 8 for the reference electrodes 03_tcp_ar_a. From the charts, it is visually evident that the quantity of data is reduced significantly as each input configuration filters out information. The performance metrics for Zoned Whole Frequency and Whole Frequency deviate by only around 0.01 for all the measures. The Zoned Whole Frequency approach achieves excellent classification results using nearly half of the quantity of data when compared to Whole Frequency. The results can be replicated for other seizure types with similar performance. Additional information is reported in the results table and, for seizure type CPSZ, is visually represented in Figs. 9, 10, and 11.

Conclusions and future works
In this work, we introduced a novel natural language processing approach to predict seizures using EEG data. The approach is based on the efficient selection of frequency bands and scalp electrodes. By analyzing the patients' clinical reports, we integrated prior knowledge into the classifier-building process, mimicking the thinking process of experts when diagnosing seizures with EEG data. In particular, we classify ictal/inter-ictal phases with three types of frequency band inputs: 1) the whole frequency range provided in the data corpus; 2) the background EEG frequency bands (α, β, θ, δ, γ); and 3) the background bands selected from each individual's clinical reports by Natural Language Processing (NLP). Together with the frequency band selection, we additionally reduced the scalp EEG electrodes by NLP analysis.
The experimental results show that interesting results can be achieved by integrating prior knowledge from experts to build individualized seizure classification models. Using prior knowledge for the selection of EEG electrodes and frequency bands influences the quantity of data that the classification model analyzes. This leads to a more efficient classification of the input data, achieving excellent results with the selection of electrodes. Mixed results have been achieved when selecting the frequency bands.
With the proposed approach, we introduced a novel methodology for patient-specific seizure detection using selected frequency bands and electrodes. The algorithm is computationally efficient compared to whole-band classification. Results show that the proposed approach may lead to more efficient implementations of the seizure classifier, to be executed on power-efficient devices for long-lasting real-time detection of seizures.
In future works, we will further explore the classification of EEG using more advanced NLP techniques on clinical reports, to extract additional information on the thinking process of medics when analyzing EEG data.

Appendix
Tables 11 and 12 report the top 50 keywords extracted from the clinical reports using the Natural Language Processing techniques described in the narrative. Performance metrics for seizure types ABSZ, MYSZ, TNSZ, and SPSZ are shown in Table 13. Table 16 shows the performance metrics of the ablation study for seizure types ABSZ, MYSZ, TNSZ, and SPSZ.